1.
Elife ; 13, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38666763

ABSTRACT

A crucial event in sexual reproduction is when haploid sperm and egg fuse to form a new diploid organism at fertilization. In mammals, direct interaction between egg JUNO and sperm IZUMO1 mediates gamete membrane adhesion, yet their role in fusion remains enigmatic. We used AlphaFold to predict the structure of other extracellular proteins essential for fertilization to determine if they could form a complex that may mediate fusion. We first identified TMEM81, whose gene is expressed by mouse and human spermatids, as a protein having structural homologies with both IZUMO1 and another sperm molecule essential for gamete fusion, SPACA6. Using a set of proteins known to be important for fertilization and TMEM81, we then systematically searched for predicted binary interactions using an unguided approach and identified a pentameric complex involving sperm IZUMO1, SPACA6, TMEM81 and egg JUNO, CD9. This complex is structurally consistent with both the expected topology on opposing gamete membranes and the location of predicted N-glycans not modeled by AlphaFold-Multimer, suggesting that its components could organize into a synapse-like assembly at the point of fusion. Finally, the structural modeling approach described here could be more generally useful to gain insights into transient protein complexes difficult to detect experimentally.


Subject(s)
Membrane Proteins , Animals , Male , Mice , Humans , Membrane Proteins/metabolism , Membrane Proteins/genetics , Membrane Proteins/chemistry , Spermatozoa/physiology , Spermatozoa/metabolism , Immunoglobulins/genetics , Immunoglobulins/metabolism , Immunoglobulins/chemistry , Sperm-Ovum Interactions/physiology , Female
2.
Bioinformatics ; 40(1), 2024 01 02.
Article in English | MEDLINE | ID: mdl-38175787

ABSTRACT

MOTIVATION: Understanding metal-protein interaction can provide structural and functional insights into cellular processes. As the number of protein sequences increases, developing fast yet precise computational approaches to predict and annotate metal-binding sites becomes imperative. Quick and resource-efficient pre-trained protein language model (pLM) embeddings have successfully predicted binding sites from protein sequences despite not using structural or evolutionary features (multiple sequence alignments). Using residue-level embeddings from the pLMs, we have developed a sequence-based method (M-Ionic) to identify metal-binding proteins and predict residues involved in metal binding. RESULTS: On independent validation of recent proteins, M-Ionic reports an area under the curve (AUROC) of 0.83 (recall = 84.6%) in distinguishing metal binding from non-binding proteins compared to AUROC of 0.74 (recall = 61.8%) of the next best method. In addition to comparable performance to the state-of-the-art method for identifying metal-binding residues (Ca2+, Mg2+, Mn2+, Zn2+), M-Ionic provides binding probabilities for six additional ions (i.e. Cu2+, PO43-, SO42-, Fe2+, Fe3+, Co2+). We show that the pLM embedding of a single residue contains sufficient information about its neighbours to predict its binding properties. AVAILABILITY AND IMPLEMENTATION: M-Ionic can be used on your protein of interest using a Google Colab Notebook (https://bit.ly/40FrRbK). The GitHub repository (https://github.com/TeamSundar/m-ionic) contains all code and data.


Subject(s)
Metals , Proteins , Proteins/chemistry , Amino Acid Sequence , Binding Sites , Ions , Protein Domains , Metals/chemistry , Metals/metabolism
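The AUROC and recall figures quoted in the abstract above can be illustrated with a minimal sketch. This is not the M-Ionic evaluation code; the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration. AUROC is computed here as the fraction of positive/negative pairs in which the positive example receives the higher score.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall(labels, scores, threshold=0.5):
    """Fraction of true positives scored at or above the (assumed) threshold."""
    pos_scores = [s for l, s in zip(labels, scores) if l == 1]
    if not pos_scores:
        raise ValueError("need at least one positive example")
    return sum(1 for s in pos_scores if s >= threshold) / len(pos_scores)


# Hypothetical toy data: 1 = metal-binding protein, 0 = non-binding protein.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(auroc(toy_labels, toy_scores), recall(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; a rank-based formulation would be preferable for large benchmarks.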
3.
Commun Chem ; 6(1): 229, 2023 Oct 25.
Article in English | MEDLINE | ID: mdl-37880344

ABSTRACT

The computational design of peptide binders towards a specific protein interface can aid diagnostic and therapeutic efforts. Here, we design peptide binders by combining the known structural space searched with Foldseek, the protein design method ESM-IF1, and AlphaFold2 (AF) in a joint framework. Foldseek generates backbone seeds for a modified version of ESM-IF1 adapted to protein complexes. The resulting sequences are evaluated with AF using an MSA representation for the receptor structure and a single sequence for the binder. We show that AF can accurately evaluate protein binders and that our bind score can select these (ROC AUC = 0.96 for the heterodimeric case). We find that designs created from seeds with more contacts per residue are more successful and tend to be short. There is a relationship between the sequence recovery in interface positions and the pLDDT of the designs, where designs with ≥80% recovery have an average pLDDT of 84 compared to 55 at 0%. Designed sequences have 60% higher median pLDDT values towards intended receptors than non-intended ones. Successful binders (predicted interface RMSD ≤ 2 Å) are designed towards 185 (6.5%) heteromeric and 42 (3.6%) homomeric protein interfaces with ESM-IF1 compared with 18 (1.5%) using ProteinMPNN from 100 samples.

4.
J Struct Biol ; 215(4): 108023, 2023 12.
Article in English | MEDLINE | ID: mdl-37652396

ABSTRACT

Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.


Subject(s)
Proteins , Tandem Repeat Sequences , Proteins/genetics , Proteins/chemistry , Tandem Repeat Sequences/genetics , Amino Acid Sequence
5.
Proc Natl Acad Sci U S A ; 120(33): e2305393120, 2023 08 15.
Article in English | MEDLINE | ID: mdl-37556498

ABSTRACT

Toxin-antitoxin (TA) systems are a large group of small genetic modules found in prokaryotes and their mobile genetic elements. Type II TAs are encoded as bicistronic (two-gene) operons that encode two proteins: a toxin and a neutralizing antitoxin. Using our tool NetFlax (standing for Network-FlaGs for toxins and antitoxins), we have performed a large-scale bioinformatic analysis of proteinaceous TAs, revealing interconnected clusters constituting a core network of TA-like gene pairs. To understand the structural basis of toxin neutralization by antitoxins, we have predicted the structures of 3,419 complexes with AlphaFold2. Together with mutagenesis and functional assays, our structural predictions provide insights into the neutralizing mechanism of the hyperpromiscuous Panacea antitoxin domain. In antitoxins composed of standalone Panacea, the domain mediates direct toxin neutralization, while in multidomain antitoxins the neutralization is mediated by other domains, such as PAD1, Phd-C, and ZFD. We hypothesize that Panacea acts as a sensor that regulates TA activation. We have experimentally validated 16 NetFlax TA systems and used domain annotations and metabolic labeling assays to predict their potential mechanisms of toxicity (such as membrane disruption, and inhibition of cell division or protein synthesis) as well as biological functions (such as antiphage defense). We have validated the antiphage activity of a RosmerTA system encoded by Gordonia phage Kita, and used fluorescence microscopy to confirm its predicted membrane-depolarizing activity. The interactive version of the NetFlax TA network that includes structural predictions can be accessed at http://netflax.webflags.se/.


Subject(s)
Antitoxins , Bacterial Toxins , Antitoxins/genetics , Bacterial Toxins/metabolism , Prokaryotic Cells/metabolism , Operon/genetics , Computational Biology , Bacterial Proteins/genetics , Bacterial Proteins/metabolism
6.
Bioinformatics ; 39(7), 2023 07 01.
Article in English | MEDLINE | ID: mdl-37405868

ABSTRACT

MOTIVATION: Despite near-experimental accuracy on single-chain predictions, there is still scope for improvement among multimeric predictions. Methods like AlphaFold-Multimer and FoldDock can accurately model dimers. However, how well these methods fare on larger complexes is still unclear. Further, evaluation methods of the quality of multimeric complexes are not well established. RESULTS: We analysed the performance of AlphaFold-Multimer on a homology-reduced dataset of homo- and heteromeric protein complexes. We highlight the differences between the pairwise and multi-interface evaluation of chains within a multimer. We describe why certain complexes perform well on one metric (e.g. TM-score) but poorly on another (e.g. DockQ). We propose a new score, Predicted DockQ version 2 (pDockQ2), to estimate the quality of each interface in a multimer. Finally, we modelled protein complexes (from CORUM) and identified two highly confident structures that do not have sequence homology to any existing structures. AVAILABILITY AND IMPLEMENTATION: All scripts, models, and data used to perform the analysis in this study are freely available at https://gitlab.com/ElofssonLab/afm-benchmark.


Subject(s)
Computational Biology , Protein Conformation , Computational Biology/methods
7.
Sci Adv ; 9(18): eadf9297, 2023 05 03.
Article in English | MEDLINE | ID: mdl-37134173

ABSTRACT

G protein-coupled receptors (GPCRs) control critical cellular signaling pathways. Therapeutic agents including anti-GPCR antibodies (Abs) are being developed to modulate GPCR function. However, validating the selectivity of anti-GPCR Abs is challenging because of sequence similarities among individual receptors within GPCR subfamilies. To address this challenge, we developed a multiplexed immunoassay to test >400 anti-GPCR Abs from the Human Protein Atlas targeting a customized library of 215 expressed and solubilized GPCRs representing all GPCR subfamilies. We found that ~61% of Abs tested were selective for their intended target, ~11% bound off-target, and ~28% did not bind to any GPCR. Antigens of on-target Abs were, on average, significantly longer, more disordered, and less likely to be buried in the interior of the GPCR protein than those of the other Abs. These results provide important insights into the immunogenicity of GPCR epitopes and form a basis for designing therapeutic Abs and for detecting pathological auto-Abs against GPCRs.


Subject(s)
Receptors, G-Protein-Coupled , Signal Transduction , Humans , Receptors, G-Protein-Coupled/metabolism , Antigens , Epitopes
8.
Curr Opin Struct Biol ; 80: 102594, 2023 06.
Article in English | MEDLINE | ID: mdl-37060758

ABSTRACT

In December 2020, the results of AlphaFold version 2 were presented at CASP14, sparking a revolution in the field of protein structure prediction. For the first time, a purely computational method could challenge experimental accuracy for structure prediction of single protein domains. The code of AlphaFold v2 was released in the summer of 2021, and since then, it has been shown that it can be used to accurately predict the structure of most ordered proteins and many protein-protein interactions. It has also sparked an explosion of development in the field, improving AI-based methods for predicting protein complexes and disordered regions and for designing proteins. Here I will review some of the inventions sparked by the release of AlphaFold.


Subject(s)
Protein Domains , Protein Conformation
9.
BMC Biol ; 21(1): 47, 2023 02 28.
Article in English | MEDLINE | ID: mdl-36855050

ABSTRACT

BACKGROUND: NorQ, a member of the MoxR-class of AAA+ ATPases, and NorD, a protein containing a Von Willebrand Factor Type A (VWA) domain, are essential for non-heme iron (FeB) cofactor insertion into cytochrome c-dependent nitric oxide reductase (cNOR). cNOR catalyzes NO reduction, a key step of bacterial denitrification. This work aimed at elucidating the specific mechanism of NorQD-catalyzed FeB insertion, and the general mechanism of the MoxR/VWA interacting protein families. RESULTS: We show that NorQ-catalyzed ATP hydrolysis, an intact VWA domain in NorD, and specific surface carboxylates on cNOR are all features required for cNOR activation. Supported by BN-PAGE, low-resolution cryo-EM structures of NorQ and the NorQD complex show that NorQ forms a circular hexamer with a monomer of NorD binding both to the side and to the central pore of the NorQ ring. Guided by AlphaFold predictions, we assign the density that "plugs" the NorQ ring pore to the VWA domain of NorD with a protruding "finger" inserting through the pore and suggest this binding mode to be general for MoxR/VWA couples. CONCLUSIONS: Based on our results, we present a tentative model for the mechanism of NorQD-catalyzed cNOR remodeling and suggest many of its features to be applicable to the whole MoxR/VWA family.


Subject(s)
AAA Proteins , Paracoccus denitrificans , Molecular Chaperones , Norethindrone , Structure-Activity Relationship
10.
PNAS Nexus ; 2(2): pgac303, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36743470

ABSTRACT

How the self-assembly of partially disordered proteins generates functional compartments in the cytoplasm and particularly in the nucleus is poorly understood. Nucleophosmin 1 (NPM1) is an abundant nucleolar protein that forms large oligomers and undergoes liquid-liquid phase separation by binding RNA or ribosomal proteins. It provides the scaffold for ribosome assembly but also prevents protein aggregation as part of the cellular stress response. Here, we use aggregation assays and native mass spectrometry (MS) to examine the relationship between the self-assembly and chaperone activity of NPM1. We find that oligomerization of full-length NPM1 modulates its ability to retard amyloid formation in vitro. Machine learning-based structure prediction and cryo-electron microscopy reveal fuzzy interactions between the acidic disordered region and the C-terminal nucleotide-binding domain, which cross-link NPM1 pentamers into partially disordered oligomers. The addition of basic peptides results in a tighter association within the oligomers, reducing their capacity to prevent amyloid formation. Together, our findings show that NPM1 uses a "grappling hook" mechanism to form a network-like structure that traps aggregation-prone proteins. Nucleolar proteins and RNAs simultaneously modulate the association strength and chaperone activity, suggesting a mechanism by which nucleolar composition regulates the chaperone activity of NPM1.

11.
Bioinformatics ; 39(2), 2023 02 03.
Article in English | MEDLINE | ID: mdl-36692145

ABSTRACT

MOTIVATION: Protein-protein interaction (PPI) networks and transcriptional regulatory networks are critical in regulating cells and their signaling. A thorough understanding of PPIs can provide more insight into cellular physiology in normal and disease states. Although numerous methods have been proposed to predict PPIs, predicting interactions between unknown proteins remains challenging. In this study, a novel neural network named AFTGAN was constructed to predict multi-type PPIs. As input features, ESM-1b embeddings, which carry rich biological information about proteins, were added as a protein sequence feature alongside amino acid co-occurrence similarity and one-hot encoding. The network framework is an ensemble built on a transformer encoder containing an AFT module (which weights vital protein sequence feature information) and a graph attention network (which extracts the relational features of protein pairs). RESULTS: The experimental results showed that the Micro-F1 of AFTGAN under three partitioning schemes (BFS, DFS and the random mode) was 0.685, 0.711 and 0.867 on the SHS27K dataset and 0.745, 0.819 and 0.920 on the SHS148K dataset, all higher than those of other popular methods. In addition, the experimental comparisons confirmed the performance superiority of the proposed model for predicting PPIs of unknown proteins on the STRING dataset. AVAILABILITY AND IMPLEMENTATION: The source code is publicly available at https://github.com/1075793472/AFTGAN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neural Networks, Computer , Software , Proteins/chemistry , Amino Acid Sequence , Protein Interaction Maps
12.
Nat Struct Mol Biol ; 30(2): 216-225, 2023 02.
Article in English | MEDLINE | ID: mdl-36690744

ABSTRACT

Cellular functions are governed by molecular machines that assemble through protein-protein interactions. Their atomic details are critical to studying their molecular mechanisms. However, fewer than 5% of hundreds of thousands of human protein interactions have been structurally characterized. Here we test the potential and limitations of recent progress in deep-learning methods using AlphaFold2 to predict structures for 65,484 human protein interactions. We show that experiments can orthogonally confirm higher-confidence models. We identify 3,137 high-confidence models, of which 1,371 have no homology to a known structure. We identify interface residues harboring disease mutations, suggesting potential mechanisms for pathogenic variants. Groups of interface phosphorylation sites show patterns of co-regulation across conditions, suggestive of coordinated tuning of multiple protein interactions as signaling responses. Finally, we provide examples of how the predicted binary complexes can be used to build larger assemblies helping to expand our understanding of human cell biology.


Subject(s)
Protein Interaction Maps , Signal Transduction , Humans , Mutation , Computational Biology/methods
13.
Nat Struct Mol Biol ; 29(11): 1056-1067, 2022 11.
Article in English | MEDLINE | ID: mdl-36344848

ABSTRACT

Most proteins fold into 3D structures that determine how they function and orchestrate the biological processes of the cell. Recent developments in computational methods for protein structure predictions have reached the accuracy of experimentally determined models. Although this has been independently verified, the implementation of these methods across structural-biology applications remains to be tested. Here, we evaluate the use of AlphaFold2 (AF2) predictions in the study of characteristic structural elements; the impact of missense variants; function and ligand binding site predictions; modeling of interactions; and modeling of experimental structural data. For 11 proteomes, an average of 25% additional residues can be confidently modeled when compared with homology modeling, identifying structural features rarely seen in the Protein Data Bank. AF2-based predictions of protein disorder and complexes surpass dedicated tools, and AF2 models can be used across diverse applications equally well compared with experimentally determined structures, when the confidence metrics are critically considered. In summary, we find that these advances are likely to have a transformative impact in structural biology and broader life-science research.


Subject(s)
Computational Biology , Furylfuramide , Computational Biology/methods , Binding Sites , Proteins/chemistry , Databases, Protein , Protein Conformation
14.
Nat Commun ; 13(1): 6028, 2022 10 12.
Article in English | MEDLINE | ID: mdl-36224222

ABSTRACT

AlphaFold can predict the structure of single- and multiple-chain proteins with very high accuracy. However, the accuracy decreases with the number of chains, and the available GPU memory limits the size of protein complexes which can be predicted. Here we show that one can predict the structure of large complexes starting from predictions of subcomponents. We assemble 91 out of 175 complexes with 10-30 chains from predicted subcomponents using Monte Carlo tree search, with a median TM-score of 0.51. There are 30 highly accurate complexes (TM-score ≥0.8, 33% of complete assemblies). We create a scoring function, mpDockQ, that can distinguish if assemblies are complete and predict their accuracy. We find that complexes containing symmetry are accurately assembled, while asymmetrical complexes remain challenging. The method is freely available and accessible as a Colab notebook https://colab.research.google.com/github/patrickbryant1/MoLPC/blob/master/MoLPC.ipynb .


Subject(s)
Monte Carlo Method , Proteins , Proteins/metabolism
15.
Mol Cell Proteomics ; 21(10): 100413, 2022 10.
Article in English | MEDLINE | ID: mdl-36115577

ABSTRACT

The assembly of proteins and peptides into amyloid fibrils is causally linked to serious disorders such as Alzheimer's disease. Multiple proteins have been shown to prevent amyloid formation in vitro and in vivo, ranging from highly specific chaperone-client pairs to completely nonspecific binding of aggregation-prone peptides. The underlying interactions remain elusive. Here, we turn to the machine learning-based structure prediction algorithm AlphaFold2 to obtain models for the nonspecific interactions of β-lactoglobulin, transthyretin, or thioredoxin 80 with the model amyloid peptide amyloid β and the highly specific complex between the BRICHOS chaperone domain of C-terminal region of lung surfactant protein C and its polyvaline target. Using a combination of native mass spectrometry (MS) and ion mobility MS, we show that nonspecific chaperoning is driven predominantly by hydrophobic interactions of amyloid β with hydrophobic surfaces in β-lactoglobulin, transthyretin, and thioredoxin 80, and in part regulated by oligomer stability. For C-terminal region of lung surfactant protein C, native MS and hydrogen-deuterium exchange MS reveal that a disordered region recognizes the polyvaline target by forming a complementary β-strand. Hence, we show that AlphaFold2 and MS can yield atomistic models of hard-to-capture protein interactions that reveal different chaperoning mechanisms based on separate ligand properties and may provide possible clues for specific therapeutic intervention.


Subject(s)
Amyloid beta-Peptides , Amyloid , Humans , Amyloid/chemistry , Amyloid/metabolism , Amyloid beta-Peptides/chemistry , Amyloid beta-Peptides/metabolism , Prealbumin , Deuterium , Ligands , Molecular Chaperones/metabolism , Mass Spectrometry , Machine Learning , Thioredoxins , Lactoglobulins , Pulmonary Surfactant-Associated Proteins
16.
Protein Sci ; 31(6): e4333, 2022 06.
Article in English | MEDLINE | ID: mdl-35634779

ABSTRACT

The advent of machine learning-based structure prediction algorithms such as AlphaFold2 (AF2) and RoseTTAFold has moved the generation of accurate structural models for the entire cellular protein machinery into the reach of the scientific community. However, structure predictions of protein complexes are based on user-provided input and may require experimental validation. Mass spectrometry (MS) is a versatile, time-effective tool that provides information on post-translational modifications, ligand interactions, conformational changes, and higher-order oligomerization. Using three protein systems, we show that native MS experiments can uncover structural features of ligand interactions, homology models, and point mutations that are undetectable by AF2 alone. We conclude that machine learning can be complemented with MS to yield more accurate structural models on a small and large scale.


Subject(s)
Furylfuramide , Machine Learning , Ligands , Mass Spectrometry/methods , Proteins/chemistry
18.
Proteins ; 90(7): 1493-1505, 2022 07.
Article in English | MEDLINE | ID: mdl-35246997

ABSTRACT

Scoring docking solutions is a difficult task, and many methods have been developed for this purpose. In docking, only a handful of the hundreds of thousands of models generated by docking algorithms are acceptable, causing difficulties when developing scoring functions. Today's best scoring functions can significantly increase the number of top-ranked models but still fail for most targets. Here, we examine the possibility of utilizing predicted interface residues to score docking models generated during the scan stage of a docking algorithm. Many methods have been developed to infer the regions of a protein surface that interact with another protein, but most have not been benchmarked using docking algorithms. This study systematically tests different interface prediction methods for scoring >300,000 low-resolution rigid-body template-free docking decoys. Overall, we find that contact-based interface prediction by BIPSPI is the best method to score docking solutions, with >12% of first-ranked docking models being acceptable. Additional experiments indicated that precision is a high-importance metric when estimating interface prediction quality for the production of docking constraints. Finally, we discuss several limitations of adopting interface predictions as constraints in a docking protocol.


Subject(s)
Proteins , Software , Algorithms , Benchmarking , Molecular Docking Simulation , Protein Binding , Protein Conformation , Protein Interaction Mapping/methods , Proteins/chemistry
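The abstract above singles out precision as an important metric for interface predictions. As a minimal sketch of what that metric measures, and not the benchmark code from the study, the function below compares a set of predicted interface residues against a reference set; the function name and the residue numbers are hypothetical.

```python
def interface_precision_recall(predicted, reference):
    """Precision and recall of predicted interface residues.

    predicted, reference: iterables of residue identifiers (e.g. residue numbers).
    Precision = correct predictions / all predictions.
    Recall    = correct predictions / all reference interface residues.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical residue numbers, for illustration only.
predicted_residues = {10, 11, 12, 40}
reference_residues = {11, 12, 13}
print(interface_precision_recall(predicted_residues, reference_residues))
```

High precision means that few of the residues used as docking constraints are wrong, which is plausibly why it matters more than recall when predictions are turned into constraints.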
19.
Nat Commun ; 13(1): 1265, 2022 03 10.
Article in English | MEDLINE | ID: mdl-35273146

ABSTRACT

Predicting the structure of interacting protein chains is a fundamental step towards understanding protein function. Unfortunately, no computational method can produce accurate structures of protein complexes. AlphaFold2 has shown unprecedented levels of accuracy in modelling single-chain protein structures. Here, we apply AlphaFold2 for the prediction of heterodimeric protein complexes. We find that the AlphaFold2 protocol, together with optimised multiple sequence alignments, generates models with acceptable quality (DockQ ≥ 0.23) for 63% of the dimers. From the predicted interfaces we create a simple function to predict the DockQ score, which distinguishes acceptable from incorrect models as well as interacting from non-interacting proteins with state-of-the-art accuracy. We find that, using the predicted DockQ scores, we can identify 51% of all interacting pairs at 1% FPR.


Subject(s)
Computational Biology , Proteins , Computational Biology/methods , Protein Conformation , Proteins/metabolism
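The closing statistic of the abstract above (a fraction of interacting pairs identified at a fixed false positive rate) can be sketched as follows. This is an illustrative calculation under stated assumptions, not the authors' code: the function name is hypothetical, the scores stand in for any confidence value such as a predicted DockQ, and a pair is called interacting only if its score is strictly above the cutoff.

```python
def tpr_at_fpr(labels, scores, max_fpr=0.01):
    """True positive rate at the loosest cutoff whose FPR does not exceed max_fpr.

    labels: 1 for interacting pairs, 0 for non-interacting pairs.
    scores: higher means more confident that the pair interacts.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = sorted((s for l, s in zip(labels, scores) if l == 0), reverse=True)
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    allowed = int(max_fpr * len(neg))  # number of false positives tolerated
    if allowed >= len(neg):
        return 1.0  # every negative may be a false positive, so accept everything
    cutoff = neg[allowed]  # calls require score > cutoff, so at most `allowed` negatives pass
    return sum(1 for s in pos if s > cutoff) / len(pos)


# Hypothetical toy data: three interacting and four non-interacting pairs.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.95, 0.85, 0.5, 0.9, 0.8, 0.3, 0.2]
print(tpr_at_fpr(toy_labels, toy_scores, max_fpr=0.25))
```

With only four negatives the toy example needs a generous FPR budget to allow even one false positive; at 1% FPR a benchmark needs at least 100 negatives before any false positive is tolerated.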
20.
NAR Genom Bioinform ; 4(1): lqac001, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35118376

ABSTRACT

Changes in DNA methylation have been found to be strongly correlated with age, enabling the creation of 'epigenetic clocks'. Previously, studies on the relationship between ageing and DNA methylation have assumed a linear relationship. Here, we show that several markers show non-linear behaviour. In particular, we observe a tendency for saturation with age, especially in the cerebellum. Further, we show that the relationships between significant methylation changes and ageing differ between tissues. We suggest a straightforward method of assessing all methylation-age relationships and clustering them according to their relative fold change. Our fold change selection outperforms the most common epigenetic clocks in predicting age for the cerebellum, but not for blood or the frontal cortex. Further, we find that the saturation of methylation observed at older ages for the cerebellum explains why epigenetic clocks consistently underestimate the age there. The findings imply that assuming linear correlations might cause biologically important markers to be missed.
